Approximating Subadditive Hadamard Functions on Implicit Matrices
An important challenge in the streaming model is to maintain small-space
approximations of entrywise functions performed on a matrix that is generated
by the outer product of two vectors given as a stream. In other works, streams
typically define matrices in a standard way via a sequence of updates, as in
the work of Woodruff (2014) and others. We describe the matrix formed by the
outer product, and other matrices that do not fall into this category, as
implicit matrices. As such, we consider the general problem of computing over
such implicit matrices with Hadamard functions, which are functions applied
entrywise on a matrix. In this paper, we apply this generalization to provide
new techniques for identifying independence between two vectors in the
streaming model. The previous state of the art algorithm of Braverman and
Ostrovsky (2010) gave a -approximation for the distance
between the product and joint distributions, using space , where is the length of the stream and denotes the
size of the universe from which stream elements are drawn. Our general
techniques include the distance as a special case, and we give an
improved space bound of
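
To make the setting concrete, here is a minimal sketch of an exact, non-streaming baseline for evaluating a Hadamard (entrywise) function over the outer product of two vectors that arrive as a stream of updates. It is not the small-space algorithm from the paper. The update format `(which, index, delta)`, the function name `hadamard_sum`, and the choice of `abs` as an example subadditive function are all assumptions made for illustration. The sketch also assumes `g(0) == 0`, so zero entries can be skipped.

```python
from collections import defaultdict

def hadamard_sum(stream, g):
    """Exact baseline: sum of g over all entries of outer(u, v).

    `stream` yields hypothetical updates (which, index, delta), where
    `which` is "u" or "v". This stores both vectors in full, so it uses
    linear space; it only illustrates what is being approximated.
    Assumes g(0) == 0, so indices never touched contribute nothing.
    """
    u = defaultdict(int)
    v = defaultdict(int)
    for which, index, delta in stream:
        if which == "u":
            u[index] += delta
        else:
            v[index] += delta
    # Entry (i, j) of the implicit matrix is u[i] * v[j]; apply g entrywise.
    return sum(g(a * b) for a in u.values() for b in v.values())
```

For example, with `u = (2, -1)` and `v = (3, 0, 1)` the nonzero entries of the outer product are 6, 2, -3, and -1, so `hadamard_sum(stream, abs)` returns 12.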
Makespan Minimization via Posted Prices
We consider job scheduling settings, with multiple machines, where jobs
arrive online and choose a machine selfishly so as to minimize their cost. Our
objective is the classic makespan minimization objective, which corresponds to
the completion time of the last job to complete. The incentives of the selfish
jobs may lead to poor performance. To reconcile the differing objectives, we
introduce posted machine prices. The selfish job seeks to minimize the sum of
its completion time on the machine and the posted price for the machine. Prices
may be static (i.e., set once and for all before any arrival) or dynamic (i.e.,
change over time), but they are determined only by the past, assuming nothing
about upcoming events. Obviously, such schemes are inherently truthful.
We consider the competitive ratio: the ratio between the makespan achievable
by the pricing scheme and that of the optimal algorithm. We give tight bounds
on the competitive ratio for both dynamic and static pricing schemes for
identical, restricted, related, and unrelated machine settings. Our main result
is a dynamic pricing scheme for related machines that gives a constant
competitive ratio, essentially matching the competitive ratio of online
algorithms for this setting. In contrast, dynamic pricing gives poor
performance for unrelated machines. This lower bound also exhibits a gap
between what can be achieved by pricing versus what can be achieved by online
algorithms.
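
The selfish choice described above can be sketched as a small simulation with static prices. This is an illustration only, not the pricing scheme from the paper: the function names, the first-come-first-served processing order on each machine, and the example prices are assumptions.

```python
def choose_machine(loads, speeds, prices, size):
    """Return the machine minimizing completion time plus posted price.

    Assumes each machine processes its jobs in arrival order, so a new
    job completes at (current load + its size) / speed.
    """
    def cost(i):
        return (loads[i] + size) / speeds[i] + prices[i]
    return min(range(len(loads)), key=cost)

def simulate(jobs, speeds, prices):
    """Run jobs online under static prices; return (loads, makespan)."""
    loads = [0.0] * len(speeds)
    for size in jobs:
        i = choose_machine(loads, speeds, prices, size)
        loads[i] += size
    makespan = max(load / speed for load, speed in zip(loads, speeds))
    return loads, makespan
```

With two identical unit-speed machines and two unit jobs, prices `[0, 0]` spread the jobs and give makespan 1, while prices `[0, 5]` push both jobs onto the first machine and give makespan 2, which shows how the posted prices steer the selfish choices.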
The Bane of Low-Dimensionality Clustering
In this paper, we give a conditional lower bound of on
running time for the classic k-median and k-means clustering objectives (where
n is the size of the input), even in low-dimensional Euclidean space of
dimension four, assuming the Exponential Time Hypothesis (ETH). We also
consider k-median (and k-means) with penalties where each point need not be
assigned to a center, in which case it must pay a penalty, and extend our lower
bound to at least three-dimensional Euclidean space.
This stands in stark contrast to many other geometric problems such as the
traveling salesman problem, or computing an independent set of unit spheres.
While these problems benefit from the so-called (limited) blessing of
dimensionality, as they can be solved in time or
in d dimensions, our work shows that widely-used clustering
objectives have a lower bound of , even in dimension four.
We complete the picture by considering the two-dimensional case: we show that
there is no algorithm that solves the penalized version in time less than
, and provide a matching upper bound of .
The main tool we use to establish these lower bounds is the placement of
points on the moment curve, which takes its inspiration from constructions of
point sets yielding Delaunay complexes of high complexity.
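
As a small illustration of the moment curve itself (not of the lower-bound construction, which the abstract does not spell out), the sketch below generates points (t, t^2, ..., t^d) and checks one well-known property: any d + 1 points with distinct parameters are affinely independent, because the associated matrix is a Vandermonde matrix. The function names are hypothetical.

```python
def moment_curve_point(t, d):
    """Point (t, t^2, ..., t^d) on the moment curve in d dimensions."""
    return tuple(t ** k for k in range(1, d + 1))

def det(m):
    """Determinant by cofactor expansion; fine for the tiny matrices here."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, x in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * x * det(minor)
    return total

def affinely_independent(points):
    """True if the d + 1 given points in d dimensions span a full simplex."""
    rows = [[1] + list(p) for p in points]
    return det(rows) != 0
```

For instance, the five points with parameters t = 1, ..., 5 in dimension four pass the check, whereas repeating a parameter produces two equal rows and the check fails.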
Zero-One Laws for Sliding Windows and Universal Sketches
Given a stream of data, a typical approach in streaming algorithms is to design a sophisticated algorithm with small memory that computes a specific statistic over the streaming data. Usually, computing a different statistic after the stream is gone is impossible. In this paper, we consider the following possibility: can we collect a small amount of data during the stream that is "universal," i.e., collected without knowing anything about the statistics we will later want to compute, other than the guarantee that, had we known the statistic ahead of time, it could have been computed with small memory? We show, with matching upper and lower bounds, that this is indeed possible for both the standard unbounded streaming model and the sliding window streaming model: universal statistics of polylogarithmic size can be collected, and they allow us to compute after the fact all other statistics that are computable with similar amounts of memory.
Fast Fencing
We consider very natural "fence enclosure" problems studied by Capoyleas,
Rote, and Woeginger and by Arkin, Khuller, and Mitchell in the early 90s. Given a
set of points in the plane, we aim at finding a set of closed curves
such that (1) each point is enclosed by a curve and (2) the total length of the
curves is minimized. We consider two main variants. In the first variant, we
pay a unit cost per curve in addition to the total length of the curves. An
equivalent formulation of this version is that we have to enclose unit
disks, paying only the total length of the enclosing curves. In the other
variant, we are allowed to use at most closed curves and pay no cost per
curve.
For the variant with at most closed curves, we present an algorithm that
is polynomial in both and . For the variant with unit cost per curve, or
unit disks, we present a near-linear time algorithm.
Capoyleas, Rote, and Woeginger solved the problem with at most curves in
time. Arkin, Khuller, and Mitchell used this to solve the unit cost
per curve version in exponential time. At the time, they conjectured that the
problem with curves is NP-hard for general . Our polynomial time
algorithm refutes this unless P equals NP.
Online optimization with switching cost
We consider algorithms for "smoothed online convex optimization (SOCO)" problems. SOCO is a variant of the class of "online convex optimization (OCO)" problems that is strongly related to the class of "metrical task systems", each of which has been studied extensively. Prior literature on these problems has focused on two performance metrics: regret and competitive ratio. There exist known algorithms with sublinear regret and known algorithms with constant competitive ratios; however, no known algorithm achieves both. In this paper, we show that this is due to a fundamental incompatibility between regret and the competitive ratio -- no algorithm (deterministic or randomized) can achieve sublinear regret and a constant competitive ratio, even in the case when the objective functions are linear.
A Tale of Two Metrics: Simultaneous Bounds on Competitiveness and Regret
We consider algorithms for "smoothed online convex optimization"
(SOCO) problems, which are a hybrid between online convex optimization (OCO) and metrical task system (MTS) problems. Historically, the performance metric for OCO was regret and that for MTS was competitive ratio (CR). There are algorithms with either sublinear regret or constant CR, but no known algorithm achieves both simultaneously. We show that this is a fundamental limitation: no algorithm (deterministic or randomized) can achieve sublinear regret and a constant CR, even when the objective functions are linear and the decision space is one dimensional. However, we present an algorithm that, for the important one dimensional case, provides sublinear regret and a CR that grows arbitrarily slowly.
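
To illustrate the objective in the one-dimensional linear case, the sketch below evaluates the total cost of a sequence of decisions as the sum of the per-round linear costs plus the movement between consecutive decisions. The exact definitions in the paper may differ (for example, in how the switching cost is normalized or whether the benchmark pays it); the function name, the absolute-value switching cost, and the starting point `x0` are assumptions.

```python
def soco_cost(slopes, xs, x0=0.0):
    """Total cost of decisions xs against linear costs f_t(x) = slopes[t] * x.

    Adds a switching cost |x_t - x_{t-1}| per round, starting from x0.
    """
    hitting = sum(a * x for a, x in zip(slopes, xs))
    previous = [x0] + list(xs[:-1])
    switching = sum(abs(x - p) for p, x in zip(previous, xs))
    return hitting + switching
```

With alternating slopes `[1, -1, 1, -1]`, the sequence `[0, 1, 0, 1]` that chases each round's minimizer pays hitting cost -2 and switching cost 3, for a total of 1, whereas staying at 0 throughout costs 0. This is the kind of tension between tracking the per-round optimum and avoiding movement that the two metrics measure differently.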